Evolutionary prisoner's dilemma game on a square lattice 
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A simplified prisoner's game is studied on a square lattice when the players interacting with their 
neighbors can follow two strategies: to cooperate (C) or to defect (D) unconditionally. The players 
updated in random sequence have a chance to adopt one of the neighboring strategies with a probability 
depending on the payoff difference. Using Monte Carlo simulations and dynamical cluster techniques we 
study the density c of cooperators in the stationary state. This system exhibits a continuous transition 
between the two absorbing state when varying the value of temptation to defect. In the limits c — > 
and I we have observed critical transitions belonging to the universality class of directed percolation. 

PACS numbers: 02.50.Le, 05.50.+q, 05.50.+j, 64.60.Ht 



I. INTRODUCTION 

The evolutionary prisoner's dilemma games were intro- 
duced by Axelrod M to study the emergence of cooper- 
ation rather than exploitation among selfish individuals. 
Since the pioneering work of Axelrod this approach has 
become a fruitful tool in the area of political and behavior 
sciences, biology and economics 

In the prisoner's dilemma (PD) game each of two play- 
ers have to decide simultaneously whether it wishes to 
cooperate with the other or to defect. The rewards de- 
pendent on their choices are expressed by 2 x 2 payoff ma- 
trices in agreement with the four possibilities. Assuming 
symmetric game the players get rewards R (P) if both 
choose to cooperate (defect). In the remaining two cases 
the defector's and cooperator's payoff are T (temptation 
to defect) and S (sucker's payoff) respectively. The el- 
ements of payoff matrix satisfy the following conditions: 
T > R > P > S and 2R > T + S. In this game the 
mutual cooperation leads to the highest total (average) 
payoff. The highest individual payoff (T) can only be 
reached against the other player decreasing the average 
payoff. These features makes the PD game so interesting 
in the mentioned areas. 

In earlier studies N contestants played an iterated 
round-robin prisoner's dilemma game. The population of 
contestants, which apply different algorithms to choose 
between defection and cooperation in the knowledge of 
previous decisions, was modified according to a Dar- 
winian selection rule round by round. For example, elim- 
inating the worst player the best one will have an off- 
spring inheriting the parent's strategy. In a different in- 
terpretation, the worst player adopts the best algorithm. 
Computer tournaments (simulations) were performed to 
study how the population of contestants varies with time 
Evidently, the final (stationary) state depends on 
the initial population. The simulations have clarified the 
emergence of mutual cooperation among all the players 
under some conditions. In these tournaments the winner 
the so-called 'Tit for Tat' (TFT) algorithm has a crucial 



role. This very simple algorithm cooperates in the first 
round and later it reciprocates the partner's previous de- 
cision. It forces the players to cooperate mutually and 
maintains this state against defectors. 

Beside the homogeneous system with players following 
TFT algorithm, the state where all the players choose 
to defect has proved to be stationary too, more precisely, 
spare cooperators will be suppressed due to the evolution- 
ary rule in the large N limit. More precisely, only a suf- 
ficiently large portion of mutual cooperators can survive 
among defectors. The emergence of uniform cooperation 
becomes easier when - combining the evolutionary game 
with spatial effects - the players interact much more with 
their neighbors than with those who are far away as it is 
typical in real populations. The spatial effects promote 
the survival of cooperators even if we do not use any kind 
of elaborate strategies such as the TFT. 

Recently Nowak and May j5j have introduced a spatial 
evolutionary PD game. In this model individuals located 
on a lattice play with their neighbors and with them- 
self. The strategical complexities and memories of past 
encounters are neglected by considering only two sim- 
ple kinds of individuals: those who cooperate (C) and 
those who defect (D) unconditionally. The evolutionary 
rule was also simplified by using discrete time steps. Be- 
tween two rounds individuals adopt the strategy that has 
received the highest payoff among its neighbors includ- 
ing themself. This deterministic model is equivalent to 
a two-state cellular automaton where the next state at 
a given lattice point is determined by the states on the 
surrounding points. The outcome depends on the initial 
configuration and the rescaled payoff matrix described 
by a single parameter b characterizing the measure of 
temptation to defect (see the matrix in Sec. 2). This 
model with and without self-interaction was investigated 
on different lattice structures (square, triangle, cubic). 
The most exhaustive analysis is performed on a square 
lattice taking into account the interactions with the first 
and second neighbors and self-interaction. Nowak and 
May observed a rich variety of spatial and temporal dy- 
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namics dependent on the value of b. For example, the co- 
operators can invade the word of defectors along straight 
border lines while defectors gain along irregular bound- 
aries for a given interval of b. Furthermore, the above 
rules conserve the symmetries of the initial state for ad- 
equate boundary conditions. Due to the discrete nature 
of total payoff there appear sharp steps when varying b. 

Introducing stochastic evolutionary rules between two 
rounds Nowak et al. Q have extended the above model. 
Although the stochasticity simplifies the dynamics it 
does not change the basic observations that cooperators 
and defectors can coexist. The randomness destroys the 
straight border lines as well as other symmetries that 
appear in the deterministic model. 

Hubermann and Glance @ have studied a similar 
model using continuous time simulations where players 
are chosen randomly and immediately updated. Their 
results support that the above conclusions are not af- 
fected by whether we use continuous or discrete time in 
the stochastic case ||. 

In the present work we study a PD game with a slightly 
different continuous time evolution on square lattice. In 
the modified model the players need less intelligence to 
decide whether they adopt one of the neighboring strat- 
egy or not. Using systematic Monte Carlo simulations 
and generalized mean-field techniques we calculate the 
density of cooperators as a function of b for different noise 
levels. It will be shown that the transitions from the ac- 
tive state (coexistence of defectors and cooperators) to 
the absorbing ones (all D or all C) exhibit universal be- 
havior. 



II. THE MODEL 

The players located on a square lattice can follow only 
two simple strategies: C (always cooperate) and D (al- 
ways defect). Due to this simplification this system can 
be handled with the Ising formalism and we can use the 
sophisticated techniques developed in non-equilibrium 
statistical physics. Each player plays a PD game with 
itself and with its neighbors. The total payoff of a cer- 
tain player is the sum over all interactions. The elements 
of payoff matrix can be rescaled because the evolutionary 
rule depends on the payoff differences between the play- 
ers. Accepting the idea suggested by Nowak and May ||] 
we choose R = 1, P = S = and T = b. Thus, the 
payoff to player A against B is given by the matrix: 
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where b > 1. 

Two systems will be considered subsequently In the 
first case only the first neighbors are taken into account. 
This means that the total payoff of a defector surrounded 
by cooperators is 46 while the cooperator's payoff is 5 in 



the same surroundings. In the second case the neighbor- 
hood includes the first- and second-neighbors. Thus the 
payoffs of the defector and cooperator are 8b and 9 in the 
sea of cooperators. 

The randomly chosen player X revises its strategy ac- 
cording to the following rules. This player selects one of 
its neighbours Y with equal probability. Given the total 
payoffs (Ex and Ey) from the previous round, player X 
adopts the neighbor's strategy with the probability: 



l + cxp[-{E Y -E x )/K} y ' 

where Ey is the neighbor's payoff and K characterizes 
the noise introduced to permit irrational choices. For 
successful strategy adoptation the new state as well as 
the new payoffs are updated. Notice that the decision 
is not affected by the variation of total payoff involving 
the change in the surroundings. Starting from a random 
initial state the above process is repeated many times. 

For K = the player X adopts Y~'s strategy if Ey > 
Ex- In this case the randomness is represented by the 
selection of the players X and Y. The finite value of K 
characterizes the range of payoff difference within which 
the irrational decision can typically appear. At present, 
our analysis is constrained to noise levels K < 1. 

Monte Carlo (MC) simulations are performed varying 
the value of b for fixed K values. We have determined the 
density c of cooperators using periodic boundary condi- 
tions. The system size was varied from L = 200 to 1000; 
the large sizes are required to suppress the statistical er- 
ror in the critical regions (c — ► or I). 

The above models are also investigated by the gen- 
eralized mean-field technique that proved to be very 
efficient for studying dynamical systems such as the 
one-dimensional stochastic cellular automata [[s|-|Ic| and 
driven lattice gases In fact the introduction of 

the above evolutionary rule is motivated by the demand 
to make the model more convenient for this method. In 
the present case we have adapted the two-dimensional 
method to determine the probability of the configura- 
tions appearing on 2-, 4-, 5- and 6-point clusters |U5). It 
is expected that the larger the cluster we use the more 
accurate the prediction given by this technique. At the 
level of 6-point approximation - taking the consistency 
conditions and symmetries into account - we have to de- 
termine 20 parameters by solving a set of equation of mo- 
tions for the configuration probabilities in the stationary 
state. Details of this calculations are given in previous 
papers 

III. RESULTS 

For both models the c = (all D) and 1 (all C) states 
are independent of time because the evolutionary rule 
cannot create a new strategy which can spread away un- 
der advantageous conditions. The uniform cooperation 
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(c = 1) is a stable state if b does not exceed a threshold 
value b c \ which is larger than 1. This means that any 
constellation of defectors will be defeated if b < b c \. In 
the same way the c = state remains stable for b > b C 2- 
Henceforth we will concentrate on those states which the 
cooperators and defector can coexist in, that is, when 
bd < b < b C 2- 
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FIG. 1. Density of cooperators as a function of tempta- 
tion to defect for K = 0.1. The MC data are plotted by 
squares, the results of generalized mean-field technique for 
different cluster sizes are indicated by long-dashed (2), dot- 
ted (4), dashed (5) and solid (6) lines. 

First we consider the model with first-neighbor inter- 
actions. Figure || shows the 6-dependence of the density 
c of cooperators in the coexistence region for K = 0.1. 
As indicated c decreases monotonously with increasing 
b until the second threshold b C 2 where the cooperators 
vanish. 




The results of both the MC simulations and the gener- 
alized mean-field method refer to step-like behavior be- 
coming more and more striking if we decrease the value 
of K. The sharp steps appear at the break points (e.g. 
b = 4/3, 3/2) described by Nowak and May (5). Inside 
the coexistence region the mean- field results of 4-, 5- and 
6-point approximations agree satisfactorily with the sim- 
ulations while the pair approximation yields well-marked 
difference. The best agreement is found for the 5-point 
approximations (dashed line). 

A typical snapshot on the steadt state distribution of 
cooperators and defectors is illustrated in Fig. || for 6 = 
0.4 and T = 0.1. This snapshot as well as the subsequent 
ones are a 100 x 100 portion of the full 400 x 400 lattice. 
In this case the pair- (2-point) approximation gives a 
satisfactory description of the short-range correlations. 

Notice furthermore that the mean-field predictions are 
not adequate when c tends either to or 1. Namely, 
the 4- and 6-point approximations predict a continuous 
(linear) transition, the 5-point approximation indicates a 
first-order one while the simulations suggest power law 
behavior if c — > 0. Similar situation has already been ob- 
served for a one-dimensional stochastic cellular automa- 
ton Jio| . The mentioned deviations are not surprising be- 
cause the mean-field approximations are not capable to 
handle the critical transitions exhibiting enhanced fluc- 
tuations and long-range correlations. 

In the limit c — > the cooperators can survive if they 
form scattered colonies in the background of defectors 
as illustrated in Fig. ^. In general, any compact colony 
formation would be more preferable for cooperators, how- 
ever, the defectors make them rare. 




FIG. 2. Distribution of cooperators (white boxes) and de- 
fectors (black boxes) for b = 1.4 and T = 0.1 (c = 0.515). 



FIG. 3. The cooperators (white boxes) form colonies in the 
sea of defectors (black boxes) when their density goes to 0. 

Visualizing the time-dependent configuration one can 
observe how the colonies try to spread away. Their 
center, size and shape change continuously and a sep- 
arated colony can disappear without trace. Two colonics 
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can unite providing better opportunity for their survival, 
or oppositely, a colony can divide into two (or more) 
parts. Similar phenomena can be observed in a wide 
range of dynamical processes described by the directed 
percolation (DP) jl4l, the Reggeon field theory [Tq] , the 
surface-reaction |l6| and Schlogl models Jl7j as well as the 
branching and annihilating random walks Jl8[ . Grass- 
berger and Janssen |2(| conjectured that all one- 
component models with a single absorbing state belong to 
the universality class of directed percolation. Exceptions 
can appear if the dynamics conserves some symmetries 
(e.g. parity of offsprings). 

Our MC data (shown in Fig. [I]) refer to a power law 
behavior, that is 



c oc (b c2 - b) 



(2) 



if b -> b c2 . The best fit is obtained for b c2 = 1.8472(1) 
and f3 — 0.56(3) which is consistent with the critical ex- 
ponent ((3 k 0.58) of the two-dimensional directed per- 
colation Ell. 



FIG. 4. Typical snapshot for high concentrat ion of coop- 
erators (white boxes) for b = 1.222 and T = 0.1. The small 
'gangs' of defectors (black boxes) walk randomly. 

Contrary to the above pattern defectors form small 
isolated 'gangs' as demonstrated in Fig. |^ for a typical 
stationary state if 1 — c -C 1. A single defector sur- 
rounded by cooperators has the highest payoff (fitness) 
in this system. Sooner or later this defector will have 
a neighboring offspring which reduces its payoff immedi- 
ately. (This process can be considered as a retaliation 
executed by the TFT algorithm if more elaborate strate- 
gies are permitted.) If b < 4/3 then one of the defectors 
will be defeated within a short time. The iteration of 
this process yields randomly walking gangs. Two collid- 
ing gangs can unite into one. Due to the possibility of 
irrational choices a single gang can divide into two or 
can disappear. Shortly, the gangs can be considered as 



branching and annihilating random walkers whose criti- 
cal behavior belongs to the DP universality class too. 

In the deterministic model introduced by Nowak and 
May H isolated gangs with fixed positions can occur if 
1 < b < 4/3. The density of gangs (whose size alternates 
cyclically if 5/4 < b < 4/3) depends on the initial state. 
In contrary to this feature the homogeneous cooperation 
can emerge in the stochastic models even for b > 1 as a 
consequence of the random walk and annihilation. Be- 
sides, the random walk causes the steady state density 
to be independent of the initial state. 

Despite the mentioned expectation the MC data in 
Fig. |l| do not show any power law behavior in the limit 
c — » 1. This discrepancy can be resolved by reminding 
the reader that the critical behavior is controlled by a 
simple function of the diffusion constant and the rates 
of branching and annihilation. In the present case these 
parameters are strongly non-linear functions of b at low 
K. For higher value of K, however, the non-linear con- 
tributions are negligible in the close vicinity of b c \ and 
we expect the power law behavior to appear clearly. In 
order to check this statement we have repeated the same 
analyses at higher noise level. 
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FIG. 5. Density of cooperators vs. b for K — 0.5 
as suggested by MC simulations (squares) and generalized 
mean-field approximations whose level is indicated as in 
Fig. 1. 

The results obtained for K = 0.5 are summarized in 
Fig. ||. As expected the MC data show power law behav- 
ior for both the limits c — > and 1. Detailed numerical 
analysis results in b cl = 1.2687, (3 = 0.62(5) if c -> 
and b c2 = 1.6644(2), (3 = 0.59(3) if c -»• 0. These (3 val- 
ues agree satisfactorily with the corresponding exponent 
of DP universality class. Notice, furthermore, that b c \ 
and b c2 depend on K. The determination of a K — b 
phase diagram indicating the active and absorbing states 
goes beyond the purpose of the present work. Instead of 
it we have studied the model involving second-neighbor 
interactions. 
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The generalization of our techniques to investigate the 
density of cooperators in the second model is straight- 
forward. The results of these calculations (see Fig. ||) 
refer to a behavior similar to those of the previous ver- 
sion. There are some minor differences. For example, 
the threshold values [b c \ and b C 2) are definitely smaller 
than those of the previous model. Furthermore, the con- 
vergence of the results of generalized mean-field approx- 
imation is slow. This fact indicates that the short-range 
correlations become more relevant if we take the second- 
neighbor interactions into account. 



o 0.5 




FIG. 6. Density of cooperators as a function of b for the 
model taking the second neighbor interactions into account 
at K — 0.02. Results are indicated as in Fig. 1. 

The steps of the continuous c(b) function (for b = 8/7, 
7/6, 6/5 and 5/4) becomes sharper when decreasing the 
value of K . For high noise levels the function becomes 
smooth and exhibits power law behavior with exponents 
close to the DP value at both ends of the active region. 
Inside the active phase the difference between the mean- 
field results and MC simulations decreases with increas- 
ing K. 



IV. CONCLUSIONS 

We have studied the evolution of cooperation among 
players who can follow only two simple strategies (C and 
D) and are placed on a square lattice. The individual re- 
ceives payoffs from interactions with each of its neighbors 
and itself in a PD confrontation. An evolutionary rule is 
introduced by slightly modifying the model suggested by 
Nowak et al || . Namely, a randomly chosen player is to 
adopt one of its neighboring strategy with a probability 
dependent on the payoff difference. Two versions of the 
model have been investigated. In the first case the neigh- 
borhood is limited to the first-neighbors. In the second 
case we have increased the number of neighbors by taking 
into consideration the second-neighbors too. 



The MC simulations have given direct evidence of the 
existence of two absorbing states (c = 1 if b < b c i and 
c = if b > b C 2j. It is remarkable that the homogeneous 
cooperation proved to be stable against the temptation 
to defect for 1 < b < b c \ due to the randomness and pos- 
sibility of irrational choice. We have found significantly 
different (if-dependent) threshold values in the models 
we are interested in. It is expected that b c \ tends to 1 if 
we increase the number of neighbors. 

For high density of defectors the cooperators form- 
ing compact blocks can spread if b < b C 2- Comparing 
the present models with the corresponding determinis- 
tic versions |5| we can state that the active region is 
reduced by the stochasticity. For example, in the de- 
terministic version of our second model a competition 
between the C and D invasion processes can be observed 
if 9/5 < b < 2 because the cooperators invade along 
straight lines meanwhile the defectors win along irreg- 
ular boundaries. In this parameter range Mukherji et 
al. |22| have observed that the cooperation is eliminated 
when introducing stochastic elements. This is not sur- 
prising because the C invasion along straight lines is not 
permitted in the stochastic models. At lower value of b, 
however, the spatial effects can facilitate the survival of 
cooperators |6|j23[]. In the present stochastic model the 
second threshold value of b is decreased by the random- 
ness, namely, we have found b C 2 < 1.4 for K = 0.02, 0.1 
and 0.5 . 

The generalized mean-field approximations have clari- 
fied the importance of short range correlations for both 
versions of the stochastic evolutionary PD game inside 
the coexistence region. Unfortunately, this technique is 
not applicable in the critical regions (c — * and 1) where 
long-range correlations and fluctuations play a dominant 
role. 

In these critical regions the MC simulations indicated 
clearly power law behavior, namely c cx (b C 2 — b) 13 and 
1 — c oc (b — b c i)P) at sufficiently high noise levels. The 
values of (5 deduced from the MC data agree well with the 
DP exponent for both versions. These findings corrobo- 
rate the conjecture according to which the transitions in 
all one-component models to an absorbing state belong 
to the DP universality class in the absence of conserved 
symmetries. The curiosity of the present model is that 
here we have two different (non-symmetric) absorbing 
states whose stability regions are separated by the active 
phase. For low values of K the appearance of power law 
behavior against b is distorted by the strongly nonlinear 
^-dependence of the diffusion and annihilation. Due to 
the robustness of DP universality class similar critical be- 
havior is expected for many other versions of stochastic 
evolutionary rules. 
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